Regression Technique


Sparse Regression for Machine Translation

Biçici, Ergun

arXiv.org Artificial Intelligence

We use transductive regression techniques to learn mappings between source and target features of given parallel corpora and use these mappings to generate machine translation outputs. We show the effectiveness of $L_1$ regularized regression (\textit{lasso}) for learning mappings between sparsely observed feature sets, compared with $L_2$ regularized regression. Proper selection of training instances plays an important role in learning correct feature mappings within limited computational resources and at expected accuracy levels; we introduce the \textit{dice} instance selection method, which improves the source and target coverage of the training set. We show that $L_1$ regularized regression performs better than $L_2$ regularized regression both in regression measurements and in translation experiments using graph decoding. We present encouraging results when translating from German to English and from Spanish to English. We also demonstrate results when the phrase table of a phrase-based decoder is replaced with the mappings found by the regression model.
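
The abstract does not spell out how the \textit{dice} method scores candidate training instances, but the name suggests the standard Dice coefficient over feature sets. As a minimal sketch under that assumption (the feature sets and corpus below are illustrative, not from the paper):

```python
def dice(source_features, candidate_features):
    """Dice coefficient between two feature sets: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(source_features), set(candidate_features)
    if not a and not b:
        return 0.0
    return 2.0 * len(a & b) / (len(a) + len(b))

# Rank training instances by feature overlap with the test sentence,
# so the selected subset covers the test features as well as possible.
test_feats = {"das", "haus", "das haus"}
corpus = [
    ("the house", {"das", "haus", "das haus"}),
    ("a car", {"ein", "auto"}),
]
ranked = sorted(corpus, key=lambda pair: dice(test_feats, pair[1]), reverse=True)
```

Selecting the top-ranked instances keeps the regression problem small while maximizing coverage of the features actually observed at test time.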


Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression

Neural Information Processing Systems

Generalized Linear Models (GLMs) and Single Index Models (SIMs) provide powerful generalizations of linear regression, where the target variable is assumed to be a (possibly unknown) 1-dimensional function of a linear predictor. In general, these problems entail non-convex estimation procedures, and, in practice, iterative local search heuristics are often used. Kalai and Sastry (2009) provided the first provably efficient method, the Isotron algorithm, for learning SIMs and GLMs, under the assumption that the data is in fact generated under a GLM and under certain monotonicity and Lipschitz (bounded slope) constraints.
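
The isotonic-regression step at the heart of such methods is usually solved with the pool adjacent violators (PAV) algorithm. A minimal unweighted sketch (not the Isotron algorithm itself, just the monotone least-squares fit it relies on):

```python
def pav(y):
    """Pool Adjacent Violators: least-squares nondecreasing fit to y."""
    # Each block holds [sum, count]; merge blocks while their means
    # violate the nondecreasing order.
    blocks = []
    for v in y:
        blocks.append([v, 1])
        # Means compared by cross-multiplication to avoid division.
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit

print(pav([1, 3, 2]))  # the 3 and 2 are pooled to their mean 2.5
```

Each merged block is replaced by its mean, which is exactly the projection onto the set of nondecreasing sequences under squared error.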


b3ba8f1bee1238a2f37603d90b58898d-Reviews.html

Neural Information Processing Systems

First of all, we would like to thank the reviewers for their comments and insights. We have proposed to cast audio-based content recommendation as a latent factor regression problem using usage data as ground truth for training. We believe this is a novel idea, and by formulating the problem in this fashion we are able to build upon a large pre-existing body of research on regression techniques, as well as latent factor models. As a result, we of course agree with the reviewers that our paper does not contain any purely algorithmic machine learning contributions. However, we consider the fact that this idea can be applied with any kind of regression technique or latent factor model to be an advantage of our approach.


ae5e3ce40e0404a45ecacaaf05e5f735-Reviews.html

Neural Information Processing Systems

We are grateful to the reviewers for their careful reading of our manuscript and their suggestions. The KCSD is a complex object to study, and an important aspect of our contribution is estimating its properties with good asymptotic results under mild conditions. We introduce an unbiased estimator and a statistical test using fast algorithms that are easily applicable to many datasets. Our results cannot be found elsewhere, and further work can build on our mathematical treatment to assess the statistical properties of kernel methods for stationary data. Most importantly, this contribution aims to bring results from kernel methods to communities in pressing need of general time series analysis techniques with good statistical properties. Measures describing the dependency structure of the data without model assumptions, such as the linear cross-spectrum, have become standard in applications such as neurophysiology.


Support Vector Regression Machines

Neural Information Processing Systems

A new regression technique based on Vapnik's concept of support vectors is introduced. We compare support vector regression (SVR) with a committee regression technique (bagging) based on regression trees, and with ridge regression in feature space. On the basis of these experiments, SVR is expected to have advantages in high-dimensional settings because its optimization does not depend on the dimensionality of the input space.
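
What distinguishes SVR from the squared-error methods it is compared against is its epsilon-insensitive loss: residuals inside a tube of width epsilon cost nothing, which is what produces a sparse set of support vectors. A small sketch of that loss (illustrative values, not from the paper):

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """SVR's epsilon-insensitive loss: errors within the eps tube cost nothing."""
    return sum(max(0.0, abs(t - p) - eps) for t, p in zip(y_true, y_pred))

# A residual of 0.05 sits inside the tube and costs 0;
# a residual of 0.5 costs only the 0.4 that sticks out of the tube.
small = eps_insensitive_loss([1.0], [1.05])
large = eps_insensitive_loss([1.0], [1.5])
```

Only points whose residuals reach the edge of the tube end up as support vectors, so the fitted model depends on a subset of the training data rather than all of it.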


Mastering the Art of Linear Regression: A Comprehensive Guide

#artificialintelligence

Linear regression is a statistical technique for modeling the relationship between a dependent variable and one or more independent variables. At its core, it is a method for predicting a numerical outcome from a set of input variables. But how exactly does it work? In this article, we'll delve into the fundamentals of linear regression and explore its applications in a variety of fields, including economics, finance, and machine learning. We'll also discuss some of the key challenges and limitations of linear regression, and provide practical tips for implementing it in your own analyses.
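
For the one-variable case, the whole method fits in a few lines: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. A minimal self-contained sketch (the example data is made up):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (single predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope b = Cov(x, y) / Var(x); intercept a makes the line pass
    # through the point of means (mean_x, mean_y).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    b = cov / var
    a = mean_y - b * mean_x
    return a, b

a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # data lies exactly on y = 1 + 2x
```

With multiple predictors the same idea generalizes to solving the normal equations, which is what library implementations such as scikit-learn's `LinearRegression` do for you.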


Boston House Price Prediction Using Machine Learning

#artificialintelligence

Hello everyone, my name is Nivitus. Welcome to the Boston House Price Prediction tutorial, another machine learning blog on Medium. I hope you all like this blog; OK, I don't want to waste your time. Let's jump into this journey.


REGRESSION -- HOW, WHY, AND WHEN? – Towards AI

#artificialintelligence

Originally published on Towards AI, the world's leading AI and technology news and media company. If you are building an AI-related product or service, we invite you to consider becoming an AI sponsor. At Towards AI, we help scale AI and technology startups; let us help you unleash your technology to the masses. As we saw previously, supervised machine learning is divided into two categories, and we have already ventured into the realm of classification and the many algorithms it employs.


Experimental Analysis of Machine Learning Techniques for Finding Search Radius in Locality Sensitive Hashing

Jafari, Omid, Nagarkar, Parth

arXiv.org Artificial Intelligence

Finding similar data in high-dimensional spaces is one of the important tasks in multimedia applications. Exact search techniques often rely on tree-based index structures, which are known to suffer from the curse of dimensionality and are therefore limited in performance. Approximate search techniques prefer performance over accuracy: they return good-enough results while achieving better performance. Locality Sensitive Hashing (LSH) is one of the most popular approximate nearest neighbor search techniques for high-dimensional spaces. One of its most time-consuming steps is finding the neighboring points in the projected spaces. An improved LSH-based index structure, called radius-optimized Locality Sensitive Hashing (roLSH), has been proposed to use machine learning to find these neighboring points efficiently, and thus further improve the overall performance of LSH. In this paper, we extend roLSH by experimentally studying the effect of different well-known machine learning techniques on overall performance. We compare ten regression techniques on four real-world datasets and show that neural network-based techniques are the best fit for roLSH, as their accuracy/performance trade-off is the best among the techniques compared.
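
The "projected spaces" mentioned above come from LSH's random-projection hash functions. As a minimal sketch of one such hash for Euclidean distance (the dimensions, bucket width `w`, and example points are illustrative, not from the paper):

```python
import random

def make_hash(dim, w, seed=0):
    """One Euclidean-LSH hash function: h(x) = floor((a . x + b) / w)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(dim)]  # random Gaussian projection
    b = rng.uniform(0, w)                      # random offset within one bucket
    def h(x):
        return int((sum(ai * xi for ai, xi in zip(a, x)) + b) // w)
    return h

h = make_hash(dim=3, w=4.0)
# Nearby points tend to land in the same bucket; distant points tend not to.
bucket_p = h([1.0, 2.0, 3.0])
bucket_q = h([1.0, 2.0, 3.1])
```

At query time, candidates are gathered from the query's bucket and its neighbors within some search radius; predicting a good radius per query is exactly the regression problem the paper studies.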


Estimating a Book's Publication Date with Artificial Intelligence

#artificialintelligence

You're probably aware of AI's increasing ability to analyze and synthesize human language, such as the recent controversy over whether a Google chatbot is, in fact, sentient (Google claims -- and I'm inclined to believe -- that the chatbot is just very, very good at recognizing and replicating speech patterns). Since AI is so skilled at analyzing language, I wondered whether it could detect changes in language over time. Could it differentiate between texts written in, say, the 12th century and the 18th century? As it turns out, it can! To build this model, I used natural language processing, the branch of machine learning dedicated to (you guessed it!)